gh-139353: Add Objects/unicode_codecs.c file #141469

Draft
vstinner wants to merge 3 commits into python:main from vstinner:unicode_codecs

vstinner (Member) commented Nov 12, 2025

  • Copy functions from unicodeobject.c:

    • _PyUnicode_UTF8()
    • PyUnicode_UTF8(), PyUnicode_SET_UTF8()
    • PyUnicode_UTF8_LENGTH(), PyUnicode_SET_UTF8_LENGTH()
    • get_latin1_char()
    • findchar()
  • Share code with unicodeobject.c:

    • _PyUnicode_FromUCS1()
    • _PyUnicode_FiniEncodings()
    • _PyUnicode_TranslateCharmap()
    • _Py_EncodingMapType

vstinner (Member, Author) commented Nov 12, 2025

Line count:

$ wc -l Objects/unicode_codecs.c Objects/unicodeobject.c 
  6671 Objects/unicode_codecs.c
  8541 Objects/unicodeobject.c
 15212 total

In PR gh-139354, I created 3 files for codecs:

  • Objects/unicode_codecs_win.c (809 lines)
  • Objects/unicode_codecs_utf.c (2,171 lines)
  • Objects/unicode_codecs.c (3,239 lines)
